Auditory-based processing of communication sounds

Author: Walters Thomas C.
Publication venue: University of Cambridge
Publication date: 07/06/2011

This thesis examines the possible benefits of adapting a biologically-inspired model of human auditory processing as part of a machine-hearing system. Features were generated by an auditory model, and used as input to machine learning systems to determine the content of the sound. Features were generated using the auditory image model (AIM) and were used for speech recognition and audio search. AIM comprises processing to simulate the human cochlea, and a ‘strobed temporal integration’ process which generates a stabilised auditory image (SAI) from the input sound. The communication sounds which are produced by humans, other animals, and many musical instruments take the form of a pulse-resonance signal: pulses excite resonances in the body, and the resonance following each pulse contains information both about the type of object producing the sound and its size. In the case of humans, vocal tract length (VTL) determines the size properties of the resonance. In the speech recognition experiments, an auditory filterbank was combined with a Gaussian fitting procedure to produce features which are invariant to changes in speaker VTL. These features were compared against standard mel-frequency cepstral coefficients (MFCCs) in a size-invariant syllable recognition task. The VTL-invariant representation was found to produce better results than MFCCs when the system was trained on syllables from simulated talkers of one range of VTLs and tested on those from simulated talkers with a different range of VTLs. The image stabilisation process of strobed temporal integration was analysed. Based on the properties of the auditory filterbank being used, theoretical constraints were placed on the properties of the dynamic thresholding function used to perform strobe detection. These constraints were used to specify a simple, yet robust, strobe detection algorithm. 
The syllable recognition system described above was then extended to produce features from profiles of the SAI and tested with the same syllable database as before. For clean speech, performance of the features was comparable to that of those generated from the filterbank output. However, when pink noise was added to the stimuli, performance dropped more slowly as a function of signal-to-noise ratio when using the SAI-based AIM features than when using either the filterbank-based features or the MFCCs, demonstrating the noise-robustness properties of the SAI representation. The properties of the auditory filterbank in AIM were also analysed. Three models of the cochlea were considered: the static gammatone filterbank, the dynamic compressive gammachirp (dcGC) and the pole-zero filter cascade (PZFC). The dcGC and gammatone are standard filterbank models, whereas the PZFC is a filter cascade, which more accurately models signal propagation in the cochlea. However, while the architecture of the filterbanks is different, they have all been successfully fitted to psychophysical masking data from humans. The abilities of the filterbanks to measure pitch strength were assessed, using stimuli which evoke a weak pitch percept in humans, in order to ascertain whether there is any benefit in the use of the more computationally efficient PZFC. Finally, a complete sound effects search system using auditory features was constructed in collaboration with Google Research. Features were computed from the SAI by sampling the SAI space with boxes of different scales. Vector quantization (VQ) was used to convert this multi-scale representation to a sparse code. The ‘passive-aggressive model for image retrieval’ (PAMIR) was used to learn the relationships between dictionary words and these auditory codewords. These auditory sparse codes were compared against sparse codes generated from MFCCs, and the best performance was found when using the auditory features.

Apollo (Cambridge)

Wavenet based low rate speech coding

Author: Kleijn W. Bastiaan
Lim Felicia S. C.
Luebs Alejandro
Skoglund Jan
Stimberg Florian
Walters Thomas C.
Wang Quan
Publication date: 01/12/2017

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty. Our experiments confirm the high performance of the WaveNet based coder and show that the speech produced by the system is able to additionally perform implicit bandwidth extension and does not significantly impair recognition of the original speaker for the human listener, even when that speaker has not been used during the training of the generative model. Comment: 5 pages, 2 figures

arXiv.org e-Print Archive

Crossref

The Zwicky Transient Facility: Surveys and Scheduler

Author: Barlow Tom
Bellm Eric C.
Feindt Ulrich
Goobar Ariel
Graham Matthew J.
Kulkarni Shrinivas R.
Kupfer Thomas
Ngeow Chow-Choong
Nugent Peter
Ofek Eran
Prince Thomas A.
Riddle Reed
Walters Richard
Ye Quan-Zhi
Publication venue: 'IOP Publishing'
Publication date: 06/05/2019

We present a novel algorithm for scheduling the observations of time-domain imaging surveys. Our Integer Linear Programming approach optimizes an observing plan for an entire night by assigning targets to temporal blocks, enabling strict control of the number of exposures obtained per field and minimizing filter changes. A subsequent optimization step minimizes slew times between each observation. Our optimization metric self-consistently weights contributions from time-varying airmass, seeing, and sky brightness to maximize the transient discovery rate. We describe the implementation of this algorithm on the surveys of the Zwicky Transient Facility and present its on-sky performance. Comment: Published in PASP Focus Issue on the Zwicky Transient Facility (https://dx.doi.org/10.1088/1538-3873/ab0c2a). 13 pages, 11 figures

arXiv.org e-Print Archive

eScholarship - University of California

Caltech Authors

A Model for Assessing the Visual Resources of River Basins as an Aid to Making Landuse Planning Decisions

Author: Davis Molly M.
Elliot Cindy C.
Meshako Diane S.
Nieman Thomas J.
Walters David
Publication venue: UKnowledge
Publication date: 01/07/1986

The visual quality of a river basin and its associated properties can be identified, evaluated and integrated into the landscape planning process. The model developed provides a quantitative methodology for determining visual quality on the basis of available Geographic Information System factors. These factors are utilized to develop the preference attributes, COLOR, FORM, TEXTURE and LINE, which are associated with the assessment of visual quality. The preference attributes are then combined through a decision making process into a continuum of DISTINCTIVE, GOOD, AVERAGE and MINIMAL visual quality, which is expressed digitally in map format. Because visual quality information is provided in a digital format, it can be treated as a discrete component of the planning process similar to physical, cultural and economic attributes.

University of Kentucky

Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder

Author: Gârbacea Cristina
Li Yazhe
Lim Felicia S C
Luebs Alejandro
Oord Aäron van den
Vinyals Oriol
Walters Thomas C
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/10/2019

In order to efficiently transmit and store speech signals, speech codecs create a minimally redundant representation of the input signal which is then decoded at the receiver with the best possible perceptual quality. In this work we demonstrate that a neural network architecture based on VQ-VAE with a WaveNet decoder can be used to perform very low bit-rate speech coding with high reconstruction quality. A prosody-transparent and speaker-independent model trained on the LibriSpeech corpus coding audio at 1.6 kbps exhibits perceptual quality which is around halfway between the MELP codec at 2.4 kbps and AMR-WB codec at 23.05 kbps. In addition, when training on high-quality recorded speech with the test speaker included in the training set, a model coding speech at 1.6 kbps produces output of similar perceptual quality to that generated by AMR-WB at 23.05 kbps. Comment: ICASSP 201

arXiv.org e-Print Archive

Crossref

Membrane Association and Destabilization by Aggregatibacter Actinomycetemcomitans Leukotoxin Requires Changes in Secondary Structures

Author: Baranwal Somesh
Boesze-Battaglia Kathleen
Brown Angela C.
Du Yurong
Edrington Thomas C.
Lally Jennifer T.
Walters Michael J.
Publication venue: ScholarlyCommons
Publication date: 01/10/2013

Aggregatibacter actinomycetemcomitans is a common inhabitant of the upper aerodigestive tract of humans and non-human primates and is associated with disseminated infections, including lung and brain abscesses, pediatric infective endocarditis, and localized aggressive periodontitis. A. actinomycetemcomitans secretes a repeats-in-toxin protein, leukotoxin, which exclusively kills lymphocyte function-associated antigen-1-bearing cells. The toxin's pathological mechanism is not fully understood; however, experimental evidence indicates that it involves the association with and subsequent destabilization of the target cell's plasma membrane. We have long hypothesized that leukotoxin secondary structure is strongly correlated with membrane association and/or destabilization. In this study, we tested this hypothesis by analyzing lipid-induced changes in leukotoxin conformation. Upon incubation of leukotoxin with lipids that favor leukotoxin-membrane association, we observed an increase in leukotoxin α-helical content that was not observed with lipids that favor membrane destabilization. The change in leukotoxin conformation after incubation with these lipids suggests that membrane binding and membrane destabilization have distinct secondary structural requirements, suggesting that they are independent events. These studies thus provide insight into the mechanism of cell damage that leads to disease progression by A. actinomycetemcomitans.

PubMed Central

ScholarlyCommons@Penn

Report: The 62nd Annual Caddo Conference and 27th Annual East Texas Archeological Conference, Tyler, Texas, February 28 and 29, 2020

Author: Eppich Keith
Guderjan Thomas H.
Hanratty C. Colleen
Regnier Amanda
Sills E. Cory
Simmons Christy
Souther Anthony
Walters Mark
Publication venue: SFA ScholarWorks
Publication date: 01/01/2021

The 62nd Caddo Conference and 27th East Texas Archeological Conference was held at the University Center on the campus of the University of Texas at Tyler on February 28 and 29, 2020. The conference was dedicated to the rebuilding of public facilities at Caddo Mounds State Historic Site. These facilities had been destroyed by a tornado in 2019. The conference organizers were Thomas Guderjan, Colleen Hanratty, Cory Sills, Christy Simmons (University of Texas at Tyler), Keith Eppich (Tyler Junior College), Anthony Souther (Caddo Mounds State Historic Site), Amanda Regnier (Oklahoma Archeological Survey), and Mark Walters (Texas Historical Commission Steward). Sponsors included The Center for Social Science Research and Department of Social Sciences, University of Texas at Tyler, Humanities Texas, Kevin Stingley, Arkansas Archeological Survey, Beta Analytic, Inc., Friends of Northeast Texas Archeology, East Texas Archeological Society, Maya Research Program, Tejas Archeology, Tyler Junior College, Gregg County Historical Museum, the American Indian Heritage Day of Texas organization, and the Caddo Nation. Before the formal program began, a preconference gathering was held at ETX Brewing Company at 221 S Broadway Avenue in Tyler on Thursday evening, February 27th. Approximately 250 people participated in the joint conferences.

SFA ScholarWorks

Explorations in anatomy: the remains from Royal London Hospital

Author: Armitage P.
Auden R. R.
Bailey J. B.
Basil-Holmes I. M.
Clark-Kennedy A. E.
Cooper A.
Fowler L.
Howard J.
Kalof L.
Lansbury C.
Millard A.
Morris J.
Morris J.
Pipe A.
Rixson D.
Thomas R.
Walters A. N.
Wise S.
Publication date: 30/06/2014

This paper considers the faunal remains from recent excavations at the Royal London Hospital. The remains date to the beginning of the 19th century and offer an insight into the life of the hospital's patients and practices of the attached medical school. Many of the animal remains consist of partially dissected skeletons, including the unique finds of Hermann's tortoise (Testudo hermanni) and Cercopithecus monkey. The hospital diet and developments in comparative anatomy are discussed by integrating the results with documentary research. They show that zooarchaeological study of later post-medieval material can significantly enhance our understanding of the exploitation of animals in this period.

CLoK

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Methods for the thematic synthesis of qualitative research in systematic reviews

Background: There is a growing recognition of the value of synthesising qualitative research in the evidence base in order to facilitate effective and appropriate health care. In response to this, methods for undertaking these syntheses are currently being developed. Thematic analysis is a method that is often used to analyse data in primary qualitative research. This paper reports on the use of this type of analysis in systematic reviews to bring together and integrate the findings of multiple qualitative studies.
Methods: We describe thematic synthesis, outline several steps for its conduct and illustrate the process and outcome of this approach using a completed review of health promotion research. Thematic synthesis has three stages: the coding of text 'line-by-line'; the development of 'descriptive themes'; and the generation of 'analytical themes'. While the development of descriptive themes remains 'close' to the primary studies, the analytical themes represent a stage of interpretation whereby the reviewers 'go beyond' the primary studies and generate new interpretive constructs, explanations or hypotheses. The use of computer software can facilitate this method of synthesis; detailed guidance is given on how this can be achieved.
Results: We used thematic synthesis to combine the studies of children's views and identified key themes to explore in the intervention studies. Most interventions were based in school and often combined learning about health benefits with 'hands-on' experience. The studies of children's views suggested that fruit and vegetables should be treated in different ways, and that messages should not focus on health warnings. Interventions that were in line with these suggestions tended to be more effective. Thematic synthesis enabled us to stay 'close' to the results of the primary studies, synthesising them in a transparent way, and facilitating the explicit production of new concepts and hypotheses.
Conclusion: We compare thematic synthesis to other methods for the synthesis of qualitative research, discussing issues of context and rigour. Thematic synthesis is presented as a tried and tested method that preserves an explicit and transparent link between conclusions and the text of primary studies; as such it preserves principles that have traditionally been important to systematic reviewing.

Crossref

City Research Online

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

National Centre for Research Methods: NCRM EPrints Repository